Skip to content

Conversation

@joefarebrother
Copy link
Contributor

This PR adds most of the results that #4051 does, but with fewer FPs resultng from user input flowing to arguments of non-shell commands.

@joefarebrother joefarebrother requested a review from a team September 17, 2020 16:01
@github-actions github-actions bot added the Java label Sep 17, 2020
@aschackmull
Copy link
Contributor

I've started a query to find the newly filtered results: https://lgtm.com/query/695684662946170257/

Copy link
Contributor

@aschackmull aschackmull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some inline comments, otherwise looks good. The LGTM query run should give us a sense of the impact of these changes.

@aschackmull
Copy link
Contributor

Hmm, just tried this on JDK11. Looks like there's a performance problem in ImmutableFirstArrayExpr. One indication is the size of this join_rhs:

[2020-09-18 16:17:32] (162s) Tuple counts for CommandArguments::ImmutableFirstArrayExpr#class#f#join_rhs#4:
                      7418433  ~6%     {2} r1 = SCAN Expr::Expr::getType_dispred#ff AS I OUTPUT I.<1>, I.<0>
                      193714   ~0%     {3} r2 = JOIN r1 WITH CommandArguments::ArrayOfStringType#class#f AS R ON FIRST 1 OUTPUT "java.util", "Arrays", r1.<1>
                      193714   ~0%     {2} r3 = JOIN r2 WITH Type::RefType::hasQualifiedName_dispred#bff_120#join_rhs AS R ON FIRST 2 OUTPUT R.<2>, r2.<2>
                      47266216 ~1%     {3} r4 = JOIN r3 WITH Member::Member::getDeclaringType_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, "copyOf", r3.<1>
                      1937140  ~5%     {2} r5 = JOIN r4 WITH Element::Element::hasName_dispred#ff AS R ON FIRST 2 OUTPUT r4.<0>, r4.<2>
                      93951290 ~0%     {3} r6 = JOIN r5 WITH Expr::MethodAccess::getMethod_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, 0, r5.<1>
                      93951290 ~4%     {2} r7 = JOIN r6 WITH Expr::MethodAccess::getArgument_dispred#fff AS R ON FIRST 2 OUTPUT R.<2>, r6.<2>
                                       return r7
[2020-09-18 16:17:33] (163s) Stopped the world to evict 257874211 ARRAYS at sequence stamp o+22420191
[2020-09-18 16:17:33] (163s) Starting the world again: 259025536 bytes forgotten up to a UNIMPORTANT item with sequence stamp o+7688682
[2020-09-18 16:17:34] (164s) Stopped the world to evict 257874211 ARRAYS at sequence stamp o+22420818
[2020-09-18 16:17:34] (164s) Starting the world again: 259025536 bytes forgotten up to a UNIMPORTANT item with sequence stamp o+7897450
[2020-09-18 16:17:35] (165s) Registering CommandArguments::ImmutableFirstArrayExpr#class#f#join_rhs#4/2@a8d1a3 + [] with content 27c5130ebcg5e412i2nko3rmiac80
[2020-09-18 16:17:35] (165s)  >>> Wrote relation CommandArguments::ImmutableFirstArrayExpr#class#f#join_rhs#4/2@a8d1a3 with 93951290 rows and 2 columns.

@aschackmull
Copy link
Contributor

And the first delta (i.e. iteration 2) of ImmutableFirstArrayExpr doesn't appear to terminate within any reasonable time.

@joefarebrother
Copy link
Contributor Author

I think there's a bug in your LGTM query - since DataFlow::PathNode includes the configuration, the line not fix.hasFlowPath(source, sink) is always going to be true, so it's just going to give exactly the results of the original query.

@aschackmull
Copy link
Contributor

Thanks, yes, here's the fixed version: https://lgtm.com/query/7616579728577345448/
Results are looking good.

@aschackmull aschackmull merged commit 47506a8 into github:main Sep 22, 2020
aschackmull added a commit to aschackmull/ql that referenced this pull request Oct 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants